Relevance-Ranked Domain-Specific Synonym Discovery
Authors
Abstract
Interest in domain-specific search is growing rapidly, creating a need for domain-specific synonym discovery. The best-performing methods for this task rely on query logs and are thus difficult to use in many circumstances. We propose a method for domain-specific synonym discovery that requires only a domain-specific corpus. Our method substantially outperforms previously proposed methods in realistic evaluations. Due to the difficulty of identifying pairs of synonyms from among a large number of terms, methods have traditionally been evaluated by their ability to choose a target term's synonym from a small set of candidate terms. We generalize this evaluation by evaluating methods’ performance when required to choose a target term's synonym from progressively larger sets of candidate terms. We approach synonym discovery as a ranking problem and evaluate the methods' ability to rank a target term's candidate synonyms. Our results illustrate that while our proposed method substantially outperforms existing methods, synonym discovery is still a difficult task to automate and is best coupled with a human moderator.
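The evaluation described in the abstract, ranking a target term's candidate synonyms and checking whether the true synonym comes out on top as the candidate set grows, could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `rank_candidates` and `accuracy_at_n`, the random sampling of distractors, and the character-bigram similarity used as a scoring function are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, List, Sequence

Score = Callable[[str, str], float]


def char_bigram_jaccard(a: str, b: str) -> float:
    """Placeholder similarity (an assumption, NOT the paper's method):
    Jaccard overlap between the character bigrams of two terms."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def rank_candidates(target: str, candidates: Sequence[str], score: Score) -> List[str]:
    """Return the candidates ordered from most to least similar to the target."""
    return sorted(candidates, key=lambda c: score(target, c), reverse=True)


def accuracy_at_n(gold: Dict[str, str], vocabulary: Sequence[str],
                  score: Score, n_candidates: int, seed: int = 0) -> float:
    """Fraction of targets whose gold synonym is ranked first when it is
    mixed with (n_candidates - 1) randomly sampled distractor terms."""
    rng = random.Random(seed)
    hits = 0
    for target, synonym in gold.items():
        distractors = [w for w in vocabulary if w not in (target, synonym)]
        sample = rng.sample(distractors, n_candidates - 1)
        ranked = rank_candidates(target, sample + [synonym], score)
        hits += ranked[0] == synonym
    return hits / len(gold)


if __name__ == "__main__":
    # Hypothetical toy data, invented purely for illustration.
    gold_pairs = {"myocardial": "cardiac", "renal": "kidney"}
    vocab = ["myocardial", "cardiac", "renal", "kidney", "cardiology",
             "hepatic", "liver", "dermal", "skin", "ocular"]
    # Progressively larger candidate sets make the task harder.
    for n in (2, 4, 8):
        print(n, accuracy_at_n(gold_pairs, vocab, char_bigram_jaccard, n))
```

Swapping `char_bigram_jaccard` for any other scoring function with the same signature lets different methods be compared under identical candidate sets, since the distractor sampling is seeded.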
Related resources
Ranking and Selecting Synsets by Domain Relevance
The paper presents a novel method for domain specific sense assignment. The method determines the domain specific relevance of GermaNet synsets on the basis of the relevance of their constituent terms that cooccur within representative domain corpora. The approach is task independent and completely automatic. Experiments show results on three selected domains: business, soccer and medical.
Extending Synsets with Medical Terms
An important problematic issue with general semantic lexicons like WordNet or GermaNet is that they do not cover many terms and concepts specific to certain domains. Therefore, these resources need to be tuned to a specific domain at hand. This involves selecting those senses that are most appropriate for the domain, as well as extending the sense inventory with novel terms and novel senses tha...
Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph
We introduce a modular approach for literature-based discovery consisting of a machine reading and knowledge assembly component that together produce a graph of influence relations (e.g., “A promotes B”) from a collection of publications. A search engine is used to explore direct and indirect influence chains. Query results are substantiated with textual evidence, ranked according to their rele...
Subgroup Discovery in Ranked Data, with an Application to Gene Set Enrichment
We investigate a class of problems that deal with ranked data. Such data can be found in a variety of domains, ranging from inherently competitive fields such as sports and business, to more surprising applications such as relevance ranking and temporal data (where more recent events rank higher). In this paper, we deal with ranked data in a Subgroup Discovery setting, where we are looking to f...
Expert Discovery: A web mining approach
Expert discovery is the search for an answer to the question: “Who is the best expert of a specific subject in a particular domain within peculiar array of parameters?” Experts with domain knowledge in any field are crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...